Prove It To Yourself With The CIL

23 June 2019

This post gives an introduction to CIL, along with some examples of where it might help when teaching your friends.

I’m a natural sceptic, a characteristic that has its ups and downs - I will freely admit that I don’t know if the earth really is round because I haven’t seen the whole thing yet (much to the delight of some colleagues). On the positive side of things, a sceptic is often willing to dig a little deeper to be sure an answer is correct, which can lead to many happy hours poking around blogs, documentation and sometimes even Stack Overflow.

This scepticism really hit me when I was shoring up my knowledge of the fundamentals of computer science in anticipation of a friend spending the day in my front room for a computer science 101 boot camp (we did have a reason – the day wasn’t just for fun). How was I to prove any bold claims about programming? Claims such as “the C# conditional AND operator is short-circuiting” and “the compiler adds a default constructor if we don’t specify one”? I’d always accepted these as gospel, but had I ever bothered to check for myself? I couldn’t remember. Enter one of my favourite sayings.

Read the code CIL.

Common Intermediate Language (CIL) is the language spat out by the C# compiler when we smash that F5 button. It is a low-level language that resembles a half-half curry of assembler and high-level languages. It is a platform- and CPU-agnostic language that will be just-in-time compiled to real machine instructions by a platform-specific Common Language Runtime (CLR) at a later time.

CIL targets a stack-based machine, meaning most instructions either push a value onto the stack or pop one off it. The operands of instructions are (usually) stored on the stack. I want to stress that we don’t have to be CIL experts to use it for reasoning about our .NET code - we just need to keep an open mind and try to avoid getting bogged down in unnecessary details. If you want to code along I’d highly suggest downloading LINQPad, whose IL tab shows the CIL generated when we run a snippet and was used to write this post. Documentation on each instruction can be found here, or hover over each instruction for more detail. It should also be noted that all examples are compiled without optimisations and in LINQPad 5.
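To make the push/pop idea concrete, here is a small sketch that imitates how a stack machine would add two numbers. It is plain C# using Stack<int>, not real CIL; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TinyStackMachine
{
    // Imitates the CIL sequence "load x, load y, add" on an explicit stack.
    public static int AddOnStack(int x, int y)
    {
        var stack = new Stack<int>();
        stack.Push(x);             // like a load instruction: push x
        stack.Push(y);             // push y, which is now on top
        int right = stack.Pop();   // add pops its two operands...
        int left = stack.Pop();
        stack.Push(left + right);  // ...and pushes the result back
        return stack.Pop();
    }

    public static void Main()
    {
        Console.WriteLine(AddOnStack(2, 3)); // prints 5
    }
}
```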

Example - Declaring and initializing a variable

There are generally two or three parts on each CIL line (from left to right):

  • IL_XXXX - This is a label which can be used to refer to lines. You can mostly ignore these or think of them as line numbers.
  • The instruction, for example "ldc.i4" and "stloc.0" in fig1.
  • The operands to the instruction, for example "80 00 00 00" in fig1 is the operand to the load constant instruction. Operands might also live on the stack, as with the add instruction.
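A minimal sketch of a snippet that produces the two instructions named above. The variable name and the wrapping class are assumptions; the value 128 is chosen because its four little-endian bytes are 80 00 00 00. The CIL in the comments is approximate and can vary between compiler versions.

```csharp
using System;

public static class DeclareVariable
{
    public static int Run()
    {
        // Unoptimised CIL for the next line looks roughly like:
        //   ldc.i4  80 00 00 00   // push the 4-byte constant 0x80 (128)
        //   stloc.0               // pop it into local variable 0 (i)
        int i = 128;
        return i;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 128
    }
}
```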

Example - Calling a method. Notice how i + 10 is evaluated before the call instruction.
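A sketch of what such a call might look like. The helper Twice and the starting value of i are hypothetical; the comments describe the general shape of the CIL rather than exact compiler output.

```csharp
using System;

public static class CallMethod
{
    // Hypothetical helper, used only so there is something to call.
    public static int Twice(int x) => x * 2;

    public static int Run()
    {
        int i = 5;
        // The argument expression is evaluated before the call:
        // i is loaded, 10 is loaded, add leaves the sum on the stack,
        // and only then does the call instruction run.
        return Twice(i + 10);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 30
    }
}
```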

So, cool, but there’s nothing too interesting going on here: the CIL and C# resemble each other very closely. Let’s look at some more examples…

Claim: The C# compiler adds a default, parameter-less constructor to a class if we don’t explicitly specify one.

Don’t believe me? I can prove it…

Well would you look at that, with one simple class we can gain some insight into what the C# compiler does for us behind the scenes. We’ve convinced ourselves that the compiler really is adding in a default constructor - MyClass..ctor, it’s right there! In the past to show this I might have given some wishy-washy reasoning such as “well look – we can initialize this class without an explicit constructor, so there probably has been one added”. Now I can show it really is there!

Another insight we can gain from this example is the call to the System.Object constructor. Anyone who has sat through OOP-101 can tell you constructors are called down the inheritance hierarchy, from the least derived (always System.Object, in C#) to the most derived (assuming single inheritance!). Now I can show this concept to all my friends without wasting time writing a class and a superclass that both print out their class name in the constructor. Just read the CIL and we’ll have time for a pint too.
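The same claim can be cross-checked from C# itself using reflection. This is a sketch rather than what the IL tab shows: MyClass is assumed to be an empty class, and DefaultCtorCheck is a hypothetical wrapper.

```csharp
using System;
using System.Reflection;

// No constructor is declared here.
public class MyClass { }

public static class DefaultCtorCheck
{
    // Looks for a public instance constructor that takes no parameters.
    public static ConstructorInfo FindDefaultCtor() =>
        typeof(MyClass).GetConstructor(Type.EmptyTypes);

    public static void Main()
    {
        ConstructorInfo ctor = FindDefaultCtor();
        Console.WriteLine(ctor != null);             // True
        Console.WriteLine(ctor.Name);                // .ctor
        Console.WriteLine(typeof(MyClass).BaseType); // System.Object
    }
}
```

Reflection confirms the constructor exists, but only the CIL shows its body, including the call to the System.Object constructor.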

In summary, we can see the embellishments made by the compiler to our class:

Claim: The C# conditional AND operator is short-circuiting.

Don’t believe me? I can prove it… sometimes…

An operator is thought of as short-circuiting if it does not necessarily have to evaluate all its operands. Take the conditional AND operator, which is elegantly described in the C# 5.0 specification as:

Conditional AND (x && y): Evaluates y only if x is true

Everyone’s favourite demonstration that captures this short-circuiting nature is to write a couple of side-effecting methods, call them in place of the operands x and y above, then observe…
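A minimal sketch of that demonstration follows. The method names SideEffectOne and SideEffectTwo come from the discussion; their bodies, the return value of SideEffectTwo and the CallsToTwo counter are assumptions added for illustration.

```csharp
using System;

public static class ShortCircuitDemo
{
    // Hypothetical counter so the skipped call can be checked in code.
    public static int CallsToTwo = 0;

    public static bool SideEffectOne()
    {
        Console.WriteLine("SideEffectOne");
        return false;
    }

    public static bool SideEffectTwo()
    {
        CallsToTwo++;
        Console.WriteLine("SideEffectTwo");
        return true;
    }

    // SideEffectTwo is only reached when SideEffectOne returns true.
    public static bool Run() => SideEffectOne() && SideEffectTwo();

    public static void Main()
    {
        bool b = Run(); // writes only "SideEffectOne" to the console
    }
}
```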

The example above only writes the string literal “SideEffectOne” to the console meaning SideEffectTwo was never evaluated or invoked, but maybe we just got lucky. Better explore the CIL to be sure… (notice how 0 represents false)

Perhaps it’s just me, but I love how devious the compiler gets here. It has pulled a fast one – generating code to evaluate a conditional AND without using a single CIL "and" instruction. Let’s convince ourselves this is all above-board.

Even though SideEffectOne is guaranteed to return false, the compiler has generated code covering both possibilities. fig7 considers the scenario where SideEffectOne returns false, fig8 true. In both examples the state of the stack is shown on the far right.

We can see from the first path the compiler was paying attention in its first Boolean Algebra lecture – stylishly utilizing the identity 0 ∧ p ≡ 0, or in English, anything ANDed with false is always false. If the result of SideEffectOne is false, the short-circuiting nature of the conditional AND operator comes into play – the variable being assigned will always be set to false and SideEffectTwo will never be called, because the brfalse.s skips right over the call.

When SideEffectOne returns true we suddenly start to care about the return value of SideEffectTwo – that’s all we care about, in fact. There’s no short-circuiting involved, however we do get to browse the compiler’s bag of Boolean tricks one more time – specifically the identity 1 ∧ p ≡ p. Anything ANDed with true is simply itself, meaning b will always be assigned the value returned by SideEffectTwo, regardless of which way the coin landed.

Recalling what the C# specification has to say about conditional AND: “Evaluates y only if x is true”, we can see this perfectly describes the two paths through our CIL instructions – the second call instruction is only executed if the first returns true. A similar argument can be made about the conditional OR operator, which is of course left as an exercise for the reader.
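The two paths can be mirrored in C# by writing the branch out by hand. This is only a sketch of the control flow described above, not decompiled output; LoweredAnd and its delegate parameters are hypothetical.

```csharp
using System;

public static class LoweredAnd
{
    // Hand-written equivalent of x() && y() using a branch instead of an AND.
    public static bool And(Func<bool> x, Func<bool> y)
    {
        if (!x())
        {
            return false; // first path: 0 ∧ p ≡ 0, y is never called
        }
        return y();       // second path: 1 ∧ p ≡ p, the result is whatever y returns
    }

    public static void Main()
    {
        Console.WriteLine(And(() => false, () => true)); // False
        Console.WriteLine(And(() => true, () => true));  // True
    }
}
```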

A strange counterexample…

Compiling the following snippet:
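A minimal sketch of such a snippet, assuming p and q are plain bool locals with no side effects; the values and the wrapping class are made up for illustration. The comment describes the general shape of the CIL, which can differ between compiler versions.

```csharp
using System;

public static class LocalsAnd
{
    public static bool Run()
    {
        bool p = true;
        bool q = false;
        // With two side-effect-free locals, newer compilers can simply
        // load both values and combine them with a single "and",
        // rather than branching around the second operand.
        bool r = p && q;
        return r;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // False
    }
}
```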

We see that both p and q are evaluated without any short-circuiting. I suppose the compiler skips short-circuiting here because it would add an extra branch instruction, increasing the size and complexity of the generated code. This only works because evaluating both operands has no side effects. For completeness’ sake, here’s the CIL generated by LINQPad 4.59.00 demonstrating the extra branch instruction. Look familiar?

Claim: Named parameters are evaluated in the order specified at the calling site (rather than the order on the method signature), from left to right.

Don’t believe me? You get the picture…

Introduced in C# 4, named parameters allow a programmer to specify a parameter name at the call site of a method invocation, which in turn allows parameters to be passed to the method in a different order to the method definition. The very verbose and uncool example I would usually come up with to demonstrate the order of evaluation, without whipping out the CIL-scope, was:
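A sketch of that style of demonstration, assuming a logging helper wrapped around each argument. All names here (Track, Subtract, Log) are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class NamedArgumentOrder
{
    // Records the order in which argument expressions are evaluated.
    public static readonly List<string> Log = new List<string>();

    static int Track(string name, int value)
    {
        Log.Add(name);
        return value;
    }

    static int Subtract(int left, int right) => left - right;

    public static int Run()
    {
        Log.Clear();
        // Arguments are named in the opposite order to the signature.
        return Subtract(right: Track("right", 1), left: Track("left", 10));
    }

    public static void Main()
    {
        int result = Run();
        Console.WriteLine(string.Join(", ", Log)); // right, left
        Console.WriteLine(result);                 // 9
    }
}
```

The log shows call-site order, while the result shows each value still reached the parameter it was named for.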

If you got to this line without reading the code above, excellent work – who has time to read tens of lines when one will do? If you did read the above, you can pretty much forget it… At least we can appreciate how much time creating smaller examples and looking at the CIL saves us.

Arguments are passed to a method on the stack, pushed in signature order so that the last one ends up on top, e.g. string.Equals(a, b) will require b on the stack top, and a one below (top of the stack is to the right, remember):
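A sketch of such a call, with the expected stack layout written as comments. The local values and the wrapping class are assumptions, and the comments describe the stack conceptually rather than quoting compiler output.

```csharp
using System;

public static class StackOrder
{
    public static bool Run()
    {
        string a = "cil";
        string b = "cil";
        // Conceptually, just before the call instruction:
        //   load a  -> stack: a
        //   load b  -> stack: a, b   (b on top, a one below)
        //   call string.Equals(string, string)
        return string.Equals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // True
    }
}
```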

What if we reverse the order of a and b?